Software Methods to Increase Data Cache Performance
Abstract
Cache performance is critical to the overall performance of modern CPUs. In most processors, cycle time is almost entirely determined by the cache hit time, which places practical limits on the complexity of cache controllers. Because of the growing gap between CPU performance and memory access times, the cost of a cache miss, measured in CPU cycles, is growing steadily [1]. As a result, it becomes critical to use caches effectively. More complex replacement schemes have been proposed to predict which data is most likely to be useful in the cache [8]. In addition, complex hardware prefetch schemes have been developed to predict which data is likely to be needed in the future [3,6,7]. However, such modifications are likely to increase the hit time of the cache, and hence the cycle time of the processor. Of particular interest are software schemes to improve cache performance [2,4,5], as they require little or no modification to the instruction set architecture and, in particular, do not greatly increase the complexity of the cache controller. They also hold the potential to be applied automatically by compilers, with essentially no negative effect on the cost of software development. This paper examines three such schemes and presents their relative costs and benefits. Loop fusion increases the temporal locality of accesses by combining multiple loops that access the same elements. Array merging increases spatial locality by combining multiple arrays whose elements are always accessed together. Cache prefetch reduces cache misses by ensuring that data is in the cache before it is needed. The effect of these techniques on example programs is examined.
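The three techniques named in the abstract can be illustrated with a minimal C sketch. This is not code from the paper: the function names, the `struct point` layout, and the `PREFETCH_DISTANCE` value are hypothetical, chosen only for illustration. The prefetch variant assumes a GCC- or Clang-compatible compiler that provides `__builtin_prefetch`; on other compilers the hint is simply compiled out.

```c
#include <stddef.h>

/* Baseline: two separate loops. For large n, c[] may be evicted
 * between the first loop and the second, so it is fetched twice. */
double sum_products_unfused(const double *a, const double *b,
                            double *c, size_t n) {
    for (size_t i = 0; i < n; i++)
        c[i] = a[i] * b[i];
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += c[i];
    return sum;
}

/* Loop fusion: one loop does both steps, so c[i] is reused while
 * it is still in the cache (better temporal locality). */
double sum_products_fused(const double *a, const double *b,
                          double *c, size_t n) {
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        c[i] = a[i] * b[i];
        sum += c[i];
    }
    return sum;
}

/* Array merging: x and y are always used together, so they are
 * stored side by side and share a cache line (better spatial
 * locality) instead of living in two separate arrays. */
struct point {
    double x;
    double y;
};

double dot_merged(const struct point *p, size_t n) {
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += p[i].x * p[i].y;
    return sum;
}

/* Software prefetch: request data a fixed number of elements ahead.
 * PREFETCH_DISTANCE is an assumed value and needs tuning per machine. */
#define PREFETCH_DISTANCE 16

double sum_with_prefetch(const double *a, size_t n) {
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
#if defined(__GNUC__) || defined(__clang__)
        if (i + PREFETCH_DISTANCE < n)
            __builtin_prefetch(&a[i + PREFETCH_DISTANCE], 0, 3);
#endif
        sum += a[i];
    }
    return sum;
}
```

The fused and unfused functions return the same result; only the order of memory accesses differs. Whether any of these variants is actually faster depends on array sizes, cache geometry, and what the compiler and hardware prefetcher already do, so they should be measured rather than assumed.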
Related resources
Optimizing Performance in Highly Utilized Multicores with Intelligent Prefetching
Khan, M. 2016. Optimizing Performance in Highly Utilized Multicores with Intelligent Prefetching. Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology 1335. 54 pp. Uppsala: Acta Universitatis Upsaliensis. ISBN 978-91-554-9450-6. Modern processors apply sophisticated techniques, such as deep cache hierarchies and hardware prefetching, to increase pe...
Controlling Cache Pollution in Prefetching With Software-assisted Cache Replacement
Aggressive prefetch methods can suffer from cache pollution when prefetched data replaces useful data in the cache, causing performance degradation. In this paper, we present a methodology that ensures that cache pollution does not degrade overall performance when software or hardware prefetching methods are used. Software instructions can allow a program to kill a particular cache element, i.e...
Software Assistance for Data Caches
Hardware and software cache optimizations are active fields of research, that have yielded powerful but occasionally complex designs and algorithms. The purpose of this paper is to investigate the performance of combined though simple software and hardware optimizations. Because current caches provide little flexibility for exploiting temporal and spatial locality, two hardware modifications are pro...
REMcode: relocating embedded code for improving system efficiency - Computers and Digital Techniques, IEE Proceedings-
The memory hierarchy subsystem has a significant impact on performance and energy consumption of an embedded system. Methods which increase the hit ratio of the cache hierarchy will typically enhance the performance and reduce the embedded system’s total energy consumption. This is mainly due to reduced cache-to-memory bus transactions, fewer main memory accesses and fewer processor waiting cyc...
Enhancing the Performance of OpenLDAP Directory Server with Multiple Caching
Directory is a specialized data store optimized for efficient information retrieval which has standard information model, naming scheme, and access protocol for interoperability over network. It stores critical data such as user, resource, and policy information in the enterprise computing environment. This paper presents a performance driven design of a transactional backend of the OpenLDAP op...